A generic algorithm for reducing bias in parametric estimation
A general iterative algorithm is developed for the computation of reduced-bias parameter estimates in regular statistical models through adjustments to the score function. The algorithm unifies, and provides an appealing new interpretation for, iterative methods that have been published previously for some specific model classes. The new algorithm can usefully be viewed as a series of iterative bias corrections, thus facilitating the adjusted score approach to bias reduction in any model for which the first-order bias of the maximum likelihood estimator has already been derived. The method is tested by application to a logit-linear multiple regression model with beta-distributed responses; the results confirm the effectiveness of the new algorithm, and also reveal some important errors in the existing literature on beta regression.
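As a rough illustration of the iterative bias-correction view, the following minimal Python sketch applies the fixed-point iteration theta <- theta_ML - b(theta) to a toy example; the exponential-rate model and its first-order bias b(lambda) = lambda/n are illustrative assumptions, not taken from the paper.

    import numpy as np

    def bias_exponential_rate(lam, n):
        # Illustrative first-order bias of the ML estimator of an
        # exponential rate: E[lam_hat] - lam is approximately lam / n.
        return lam / n

    def iterative_bias_correction(theta_ml, bias, n, tol=1e-10, max_iter=100):
        # Series of iterative bias corrections: theta <- theta_ml - bias(theta, n).
        # The fixed point solves the adjusted (bias-reducing) score
        # equation to first order.
        theta = theta_ml
        for _ in range(max_iter):
            theta_new = theta_ml - bias(theta, n)
            if abs(theta_new - theta) < tol:
                return theta_new
            theta = theta_new
        return theta

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0 / 2.5, size=30)      # true rate: 2.5
    lam_ml = 1.0 / x.mean()                            # ML estimate of the rate
    lam_rb = iterative_bias_correction(lam_ml, bias_exponential_rate, x.size)
    print(lam_ml, lam_rb)                              # reduced-bias estimate is smaller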
Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models
Penalization of the likelihood by Jeffreys' invariant prior, or by a positive
power thereof, is shown to produce finite-valued maximum penalized likelihood
estimates in a broad class of binomial generalized linear models. The class of
models includes logistic regression, where the Jeffreys-prior penalty is known
additionally to reduce the asymptotic bias of the maximum likelihood estimator;
and also models with other commonly used link functions such as probit and
log-log. Shrinkage towards equiprobability across observations, relative to the
maximum likelihood estimator, is established theoretically and is studied
through illustrative examples. Some implications of finiteness and shrinkage
for inference are discussed, particularly when inference is based on Wald-type
procedures. A widely applicable procedure is developed for computation of
maximum penalized likelihood estimates, by using repeated maximum likelihood
fits with iteratively adjusted binomial responses and totals. These theoretical
results and methods underpin the increasingly widespread use of reduced-bias
and similarly penalized binomial regression models in many applied fields.
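The repeated-ML-fit computation lends itself to a compact sketch. For the logistic link with the Jeffreys-prior penalty, the adjusted score equations are those of a binomial fit with pseudo-responses y + h/2 and pseudo-totals m + h, where h holds the hat values; the minimal Python sketch below (with an illustrative simulated data set) iterates plain Fisher-scoring ML fits on those adjusted data.

    import numpy as np

    def fit_logistic_ml(X, y, m, beta=None, tol=1e-8, max_iter=50):
        # Plain binomial-logistic ML fit by Fisher scoring; y may be fractional.
        beta = np.zeros(X.shape[1]) if beta is None else beta
        for _ in range(max_iter):
            pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
            W = m * pi * (1.0 - pi)                    # working weights
            step = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - m * pi))
            beta = beta + step
            if np.max(np.abs(step)) < tol:
                break
        return beta

    def hat_values(X, beta, m):
        # Diagonal of H = W^(1/2) X (X'WX)^(-1) X' W^(1/2).
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = m * pi * (1.0 - pi)
        XW = X * np.sqrt(W)[:, None]
        return np.einsum("ij,ij->i", XW, XW @ np.linalg.inv(X.T @ (W[:, None] * X)))

    def fit_logistic_jeffreys(X, y, m, n_outer=25):
        # Repeated ML fits with iteratively adjusted responses and totals:
        # y + h/2 successes out of m + h trials, h the current hat values.
        beta = fit_logistic_ml(X, y, m)                # starting value, for simplicity
        for _ in range(n_outer):
            h = hat_values(X, beta, m)
            beta = fit_logistic_ml(X, y + h / 2.0, m + h, beta=beta)
        return beta

    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(40), rng.normal(size=40)])
    m = np.full(40, 5.0)
    y = rng.binomial(5, 1 / (1 + np.exp(-(0.3 + 0.8 * X[:, 1])))).astype(float)
    print(fit_logistic_jeffreys(X, y, m))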
Liquidity commonality does not imply liquidity resilience commonality: A functional characterisation for ultra-high frequency cross-sectional LOB data
We present a large-scale study of commonality in liquidity and resilience
across assets in an ultra high-frequency (millisecond-timestamped) Limit Order
Book (LOB) dataset from a pan-European electronic equity trading facility. We
first show that extant work in quantifying liquidity commonality through the
degree of explanatory power of the dominant modes of variation of liquidity
(extracted through Principal Component Analysis) fails to account for heavy
tailed features in the data, thus producing potentially misleading results. We
employ Independent Component Analysis, which not only decorrelates the liquidity measures in the asset cross-section but also reduces higher-order statistical dependencies.
To measure commonality in liquidity resilience, we utilise a novel characterisation of resilience as the time required for liquidity to return to a given threshold level. This reflects a dimension of liquidity that is not captured by the majority of liquidity measures and has important ramifications for understanding supply and demand pressures for market makers in electronic exchanges, as well as for regulators and high-frequency traders. When the metric is mapped out across
a range of thresholds, it produces the daily Liquidity Resilience Profile (LRP)
for a given asset. This daily summary of liquidity resilience behaviour from
the vast LOB dataset is then amenable to a functional data representation. This
enables the comparison of liquidity resilience in the asset cross-section via
functional linear sub-space decompositions and functional regression. The
functional regression results presented here suggest that market factors for
liquidity resilience (as extracted through functional principal components
analysis) can explain between 10 and 40% of the variation in liquidity
resilience at low liquidity thresholds, but are less explanatory at more
extreme levels, where individual asset factors take effect.
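A stylised Python sketch of the threshold-and-return construction: using the bid-ask spread as an inverse liquidity measure (an illustrative choice), the resilience metric for a threshold is the duration of each excursion above it, and mapping mean durations across thresholds yields a crude Liquidity Resilience Profile.

    import numpy as np

    def resilience_durations(spread, threshold):
        # Durations (in ticks) of excursions during which the spread
        # stays above `threshold` before returning to it.
        durations, run = [], 0
        for above in spread > threshold:
            if above:
                run += 1
            elif run > 0:
                durations.append(run)
                run = 0
        return np.asarray(durations)

    def liquidity_resilience_profile(spread, thresholds):
        # Daily LRP: mean return time mapped out across a range of thresholds.
        out = []
        for t in thresholds:
            d = resilience_durations(spread, t)
            out.append(d.mean() if d.size else np.nan)
        return np.array(out)

    rng = np.random.default_rng(2)
    spread = np.abs(np.cumsum(rng.normal(size=10_000))) + 1.0   # toy spread path
    print(liquidity_resilience_profile(spread, np.quantile(spread, [0.5, 0.7, 0.9])))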
Model-based clustering using copulas with applications
The majority of model-based clustering techniques are based on multivariate normal models and their variants. In this paper copulas are used for the construction of flexible families of models for clustering applications. The use of copulas in model-based clustering offers two direct advantages over current methods: (i) the appropriate choice of copulas provides the ability to obtain a range of exotic shapes for the clusters, and (ii) the explicit choice of marginal distributions for the clusters allows the modelling of multivariate data of various modes (either discrete or continuous) in a natural way. This paper introduces and studies the framework of copula-based finite mixture models for clustering applications. Estimation in the general case can be performed using standard EM, and, depending on the mode of the data, more efficient procedures are provided that can fully exploit the copula structure. The closure properties of the mixture models under marginalization are discussed, and for continuous, real-valued data parametric rotations in the sample space are introduced, with a parallel discussion on parameter identifiability depending on the choice of copulas for the components. The exposition of the methodology is accompanied and motivated by the analysis of real and artificial data.
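A minimal sketch of the density and E-step machinery for a copula-based mixture, assuming bivariate Gaussian copulas and gamma marginals purely for illustration (the framework admits other copula and marginal choices):

    import numpy as np
    from scipy import stats

    def gaussian_copula_density(u, rho):
        # Bivariate Gaussian copula density evaluated at rows of u in (0, 1)^2.
        z = stats.norm.ppf(np.clip(u, 1e-12, 1 - 1e-12))
        joint = stats.multivariate_normal(mean=[0.0, 0.0],
                                          cov=[[1.0, rho], [rho, 1.0]]).pdf(z)
        return joint / stats.norm.pdf(z).prod(axis=-1)

    def component_density(x, marginals, rho):
        # Copula construction: f(x) = c(F1(x1), F2(x2); rho) * f1(x1) * f2(x2).
        u = np.column_stack([m.cdf(x[:, j]) for j, m in enumerate(marginals)])
        pdfs = np.column_stack([m.pdf(x[:, j]) for j, m in enumerate(marginals)])
        return gaussian_copula_density(u, rho) * pdfs.prod(axis=1)

    def e_step(x, weights, components):
        # EM responsibilities for a copula-based finite mixture.
        dens = np.column_stack([w * component_density(x, marg, rho)
                                for w, (marg, rho) in zip(weights, components)])
        return dens / dens.sum(axis=1, keepdims=True)

    rng = np.random.default_rng(3)
    x = rng.gamma(shape=2.0, scale=1.0, size=(200, 2))
    components = [([stats.gamma(2.0), stats.gamma(2.0)], 0.6),
                  ([stats.gamma(4.0), stats.gamma(4.0)], -0.3)]
    print(e_step(x, [0.5, 0.5], components)[:3])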
Linking the performance of endurance runners to training and physiological effects via multi-resolution elastic net
A multiplicative effects model is introduced for the identification of the factors that are influential to the performance of highly-trained endurance runners. The model extends the established power-law relationship between performance times and distances by taking into account the effect of the physiological status of the runners, and training effects extracted from GPS records collected over the course of a year. In order to incorporate information on the runners' training into the model, the concept of the training distribution profile is introduced and its ability to capture the characteristics of the training session is discussed. The covariates that are relevant to runner performance, treated as the response, are identified using a procedure termed multi-resolution elastic net. Multi-resolution elastic net allows the simultaneous identification of scalar covariates and of intervals on the domain of one or more functional covariates that are most influential for the response. The results identify a contiguous group of speed intervals between 5.3 and 5.7 m·s⁻¹ as influential for the improvement of running performance and extend established relationships between physiological status and runner performance. Another outcome of multi-resolution elastic net is a predictive equation for performance based on the minimization of the mean squared prediction error on a test data set across resolutions.
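The multi-resolution idea can be sketched as follows: coarsen the functional covariate to interval averages at several resolutions, fit an elastic net at each resolution alongside the scalar covariates, and keep the resolution minimising test-set mean squared prediction error. All tuning values and data shapes below are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import ElasticNet
    from sklearn.metrics import mean_squared_error

    def interval_averages(F, n_intervals):
        # Coarsen functional covariates (rows of F, observed on a fine grid)
        # to averages over equal-length intervals of the domain.
        splits = np.array_split(np.arange(F.shape[1]), n_intervals)
        return np.column_stack([F[:, idx].mean(axis=1) for idx in splits])

    def multi_resolution_elastic_net(F, Z, y, F_te, Z_te, y_te, resolutions):
        # Fit an elastic net at each resolution and keep the fit minimising
        # mean squared prediction error on the test set.
        best = None
        for k in resolutions:
            model = ElasticNet(alpha=0.1, l1_ratio=0.5)
            model.fit(np.hstack([interval_averages(F, k), Z]), y)
            mse = mean_squared_error(
                y_te, model.predict(np.hstack([interval_averages(F_te, k), Z_te])))
            if best is None or mse < best[0]:
                best = (mse, k, model)
        return best

    rng = np.random.default_rng(4)
    F = rng.normal(size=(120, 60))      # e.g. training distribution profiles
    Z = rng.normal(size=(120, 2))       # scalar covariates (physiological status)
    y = F[:, 20:28].mean(axis=1) + 0.5 * Z[:, 0] + rng.normal(scale=0.1, size=120)
    mse, k, _ = multi_resolution_elastic_net(F[:80], Z[:80], y[:80],
                                             F[80:], Z[80:], y[80:], [4, 8, 16])
    print(mse, k)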
Bounded-memory adjusted scores estimation in generalized linear models with large data sets
The widespread use of maximum Jeffreys'-prior penalized likelihood in binomial-response generalized linear models, and in logistic regression in particular, is supported by the results of Kosmidis and Firth (2021, Biometrika), who show that the resulting estimates are always finite-valued, even in cases where the maximum likelihood estimates are not, which is a practical issue regardless of the size of the data set. In logistic
regression, the implied adjusted score equations are formally bias-reducing in
asymptotic frameworks with a fixed number of parameters and appear to deliver a
substantial reduction in the persistent bias of the maximum likelihood
estimator in high-dimensional settings where the number of parameters grows asymptotically linearly with, but slower than, the number of observations. In this
work, we develop and present two new variants of iteratively reweighted least squares (IWLS) for estimating generalized linear models with adjusted score equations
for mean bias reduction and maximization of the likelihood penalized by a
positive power of the Jeffreys-prior penalty, which eliminate the requirement of storing O(n) quantities in memory, and can operate with data sets that exceed computer memory or even hard drive capacity. We achieve that through
incremental QR decompositions, which enable IWLS iterations to have access only
to data chunks of predetermined size. We assess the procedures through a
real-data application with millions of observations, and in high-dimensional
logistic regression, where a large-scale simulation experiment produces
concrete evidence for the existence of a simple adjustment to the maximum
Jeffreys'-penalized likelihood estimates that delivers high accuracy in terms
of signal recovery even in cases where estimates from ML and other
recently-proposed corrective methods do not exist.
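The incremental-QR device admits a short sketch: a single weighted least-squares solve (one IWLS step) is carried out while retaining only the p x p triangular factor and a length-p rotated response, so each pass touches one chunk of rows at a time. Within adjusted-scores IWLS, the working response and weights in each chunk would be recomputed from the current coefficients; the linear model below only illustrates the memory pattern.

    import numpy as np

    def bounded_memory_wls(chunks, p):
        # Weighted least squares via incremental QR: only the p x p triangular
        # factor R and the length-p rotated response are kept in memory, so
        # the solve has access only to one data chunk at a time.
        R, qty = np.zeros((0, p)), np.zeros(0)
        for X, y, w in chunks:                     # stream over chunks of rows
            sw = np.sqrt(w)
            Q, R = np.linalg.qr(np.vstack([R, sw[:, None] * X]), mode="reduced")
            qty = Q.T @ np.concatenate([qty, sw * y])
        return np.linalg.solve(R, qty)             # R is upper triangular

    rng = np.random.default_rng(5)
    X = np.column_stack([np.ones(10_000), rng.normal(size=(10_000, 2))])
    y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=10_000)
    w = np.ones(10_000)                            # IWLS working weights would go here
    chunks = [(X[i:i + 1000], y[i:i + 1000], w[i:i + 1000])
              for i in range(0, 10_000, 1000)]
    print(bounded_memory_wls(chunks, p=3))         # close to (0.5, -1.0, 2.0)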
Diaconis-Ylvisaker prior penalized likelihood for logistic regression
We characterise the behaviour of the maximum Diaconis-Ylvisaker prior penalized likelihood estimator in high-dimensional logistic regression, where the number of covariates is a fraction κ ∈ (0, 1) of the number of observations n, as n → ∞. We derive the estimator's aggregate asymptotic behaviour when covariates are independent normal random variables with mean zero and variance 1/n, and the vector of regression coefficients has length γ√n, asymptotically. From this foundation, we devise adjusted Z-statistics, penalized likelihood ratio statistics, and aggregate asymptotic results with arbitrary covariate covariance. In the process, we fill in gaps in previous literature by formulating a Lipschitz-smooth approximate message passing recursion, to formally transfer the asymptotic results from approximate message passing to logistic regression. While the maximum likelihood estimate asymptotically exists only for a narrow range of (κ, γ) values, the maximum Diaconis-Ylvisaker prior penalized likelihood estimate not only always exists but is also directly computable using maximum likelihood routines. Thus, our asymptotic results also hold for (κ, γ) values where results for maximum likelihood are not attainable, with no overhead in implementation or computation. We study the estimator's shrinkage properties and compare it to logistic ridge regression and
demonstrate our theoretical findings with simulations.
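The "directly computable using maximum likelihood routines" property follows from conjugacy: under one common Diaconis-Ylvisaker construction with prior mass c and prior mean 1/2 (illustrative settings; the paper's exact prior parametrisation may differ), the penalized log-likelihood is a weighted binomial log-likelihood with fractional pseudo-responses, as in this sketch:

    import numpy as np

    def fit_logistic_irls(X, t, w, tol=1e-8, max_iter=100):
        # Weighted logistic ML fit by Fisher scoring; t may be fractional.
        beta = np.zeros(X.shape[1])
        for _ in range(max_iter):
            pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
            step = np.linalg.solve(X.T @ ((w * pi * (1 - pi))[:, None] * X),
                                   X.T @ (w * (t - pi)))
            beta = beta + step
            if np.max(np.abs(step)) < tol:
                break
        return beta

    def fit_logistic_dy(X, y, c=1.0):
        # Assumed conjugate construction: prior mass c, prior mean 1/2, giving
        # fractional pseudo-responses (y + c/2) / (1 + c) with case weight 1 + c.
        t = (y + c / 2.0) / (1.0 + c)
        w = np.full(len(y), 1.0 + c)
        return fit_logistic_irls(X, t, w)

    rng = np.random.default_rng(6)
    X = rng.normal(size=(200, 5)) / np.sqrt(200)
    y = rng.binomial(1, 0.5, size=200).astype(float)
    print(fit_logistic_dy(X, y, c=1.0))    # finite even when the MLE is not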
A Bayesian inference approach for determining player abilities in football
We consider the task of determining a football player's ability for a given
event type, for example, scoring a goal. We propose an interpretable Bayesian
model which is fit using variational inference methods. We implement a Poisson
model to capture occurrences of event types, from which we infer player
abilities. Our approach also allows the visualisation of differences between
players, for a specific ability, through the marginal posterior variational
densities. We then use these inferred player abilities to extend the Bayesian
hierarchical model of Baio and Blangiardo (2010) which captures a team's
scoring rate (the rate at which they score goals). We apply the resulting
scheme to the English Premier League, capturing player abilities over the
2013/2014 season, before using output from the hierarchical model to predict
whether over or under 2.5 goals will be scored in a given game in the 2014/2015
season. This validates our model as a way of providing insights into team
formation and the individual success of sports teams.
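As a lighter-weight stand-in for the paper's variational scheme, the following sketch writes down a comparable Poisson event-count model with latent player abilities and computes a MAP estimate; the exposure construction, prior scale, and all variable names are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize

    # Sketch: player i has a latent ability a_i for one event type; the count
    # of events by player i with exposure m (fraction of the game played) is
    # Poisson(m * exp(a_i)), with a N(0, s^2) prior on abilities.

    def negative_log_posterior(a, counts, exposure, player, s=1.0):
        rate = exposure * np.exp(a[player])
        loglik = np.sum(counts * np.log(rate) - rate)   # Poisson, up to a constant
        logprior = -0.5 * np.sum(a ** 2) / s ** 2
        return -(loglik + logprior)

    rng = np.random.default_rng(7)
    n_players, n_obs = 20, 400
    player = rng.integers(n_players, size=n_obs)
    exposure = rng.uniform(0.5, 1.0, size=n_obs)
    true_a = rng.normal(scale=0.5, size=n_players)
    counts = rng.poisson(exposure * np.exp(true_a[player]))
    fit = minimize(negative_log_posterior, np.zeros(n_players),
                   args=(counts, exposure, player))
    print(np.corrcoef(fit.x, true_a)[0, 1])             # abilities recovered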
Upside and Downside Risk Exposures of Currency Carry Trades via Tail Dependence
The currency carry trade is an investment strategy that involves selling low interest rate currencies in order to purchase higher interest rate currencies, thus profiting from the interest rate differentials. This is a well-known financial puzzle, since, assuming foreign exchange risk is uninhibited and the markets have rational risk-neutral investors, one would not expect profits from such strategies. That is, according to uncovered interest rate parity (UIP), changes in the related exchange rates should offset the potential to profit from such interest rate differentials. However, it has been shown empirically that investors can earn profits on average by borrowing
in a country with a lower interest rate, exchanging for foreign currency, and
investing in a foreign country with a higher interest rate, whilst allowing for
any losses from exchanging back to their domestic currency at maturity. This
paper explores the financial risk that trading strategies seeking to exploit a
violation of the UIP condition are exposed to with respect to multivariate tail
dependence present in both the funding and investment currency baskets. It will
outline in what contexts these portfolio risk exposures will benefit
accumulated portfolio returns and under what conditions such tail exposures
will reduce portfolio returns.
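A simple nonparametric check of the kind of tail behaviour at issue: rank-based estimates of lower and upper tail dependence between two return series. The estimator and the toy "basket" construction are illustrative; the paper works with model-based multivariate tail dependence in the funding and investment baskets.

    import numpy as np

    def empirical_tail_dependence(x, y, q=0.05):
        # Rank-based estimates of lower and upper tail dependence:
        # lambda_L ~ P(V <= q | U <= q), lambda_U ~ P(V > 1-q | U > 1-q).
        n = len(x)
        u = np.argsort(np.argsort(x)) / (n - 1.0)   # probability integral transform
        v = np.argsort(np.argsort(y)) / (n - 1.0)
        lower = np.mean((u <= q) & (v <= q)) / q
        upper = np.mean((u >= 1 - q) & (v >= 1 - q)) / q
        return lower, upper

    rng = np.random.default_rng(8)
    z = rng.standard_t(df=3, size=(5000, 2))        # heavy-tailed shocks
    x = z[:, 0]                                     # toy funding-basket return
    y = 0.7 * z[:, 0] + 0.3 * z[:, 1]               # toy investment-basket return
    print(empirical_tail_dependence(x, y, q=0.05))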
Location-adjusted Wald statistics for scalar parameters
Inference about a scalar parameter of interest is a core statistical task that has attracted immense research effort in statistics. The Wald statistic is a prime candidate for the task, on the grounds of the asymptotic validity of the standard normal approximation to its finite-sample distribution, its simplicity, and its low computational cost. It is well known, though, that this normal approximation can be inadequate, especially when the sample size is small or moderate relative to the number of parameters. A novel, algebraic adjustment to the Wald statistic is proposed, delivering significant improvements in inferential performance with only small implementation and computational overhead, predominantly due to additional matrix multiplications. The Wald statistic is viewed as an estimate of a transformation of the model parameters and is appropriately adjusted, using either maximum likelihood or reduced-bias estimators, bringing its expectation asymptotically closer to zero. The location adjustment depends on the expected information, an approximation to the bias of the estimator, and the derivatives of the transformation, which are all either readily available or easily obtainable in standard software for a wealth of models. An algorithm for the implementation of the location-adjusted Wald statistics in general models is provided, as well as a bootstrap scheme for the further scale correction of the location-adjusted statistic. Ample analytical and numerical evidence is presented for the adoption of the location-adjusted statistic in prominent modelling settings, including inference about log-odds and binomial proportions, logistic regression in the presence of nuisance parameters, beta regression, and gamma regression. The location-adjusted Wald statistics are used for the construction of significance maps for the analysis of multiple sclerosis lesions from MRI data.
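A generic sketch of the structure of the location adjustment, assuming (as the abstract indicates) that the statistic t = kappa(theta_hat) is re-centred by an estimate built from the estimator's bias, its variance-covariance matrix, and derivatives of the transformation kappa, here obtained by finite differences; the paper's exact formula may differ in its details.

    import numpy as np

    def location_adjusted_wald(theta_hat, kappa, bias, vcov, eps=1e-5):
        # t* = kappa(theta_hat) - [grad(kappa)' b + 0.5 tr(V hess(kappa))],
        # with derivatives of kappa obtained by central finite differences.
        p = len(theta_hat)
        def grad(f, x):
            g = np.zeros(p)
            for j in range(p):
                e = np.zeros(p)
                e[j] = eps
                g[j] = (f(x + e) - f(x - e)) / (2 * eps)
            return g
        g = grad(kappa, theta_hat)
        H = np.column_stack([grad(lambda x, j=j: grad(kappa, x)[j], theta_hat)
                             for j in range(p)])
        return kappa(theta_hat) - (g @ bias(theta_hat)
                                   + 0.5 * np.trace(vcov(theta_hat) @ H))

    # Wald statistic for a binomial proportion, null value 0.5, y = 14 of n = 20.
    n, y, theta0 = 20, 14, 0.5
    th = np.array([y / n])
    kappa = lambda x: (x[0] - theta0) / np.sqrt(x[0] * (1 - x[0]) / n)
    bias = lambda x: np.zeros(1)                     # ML proportion is unbiased
    vcov = lambda x: np.array([[x[0] * (1 - x[0]) / n]])
    print(kappa(th), location_adjusted_wald(th, kappa, bias, vcov))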